Rhythmic unit extraction and modelling for automatic language identification

Authors

  • Jean-Luc Rouas
  • Jérôme Farinas
  • François Pellegrino
  • Régine André-Obrecht
Abstract

This paper deals with an approach to automatic language identification based on rhythmic modelling. Besides phonetics and phonotactics, rhythm is one of the most promising features to consider for language identification, even though its extraction and modelling are not straightforward. One of the main problems to address is what to model. In this paper, a rhythm extraction algorithm is described: using a vowel detection algorithm, rhythmic units related to syllables are segmented. Several parameters are extracted (consonantal and vowel duration, cluster complexity) and modelled with a Gaussian mixture. Experiments are performed on read speech for seven languages (English, French, German, Italian, Japanese, Mandarin and Spanish). Results reach up to 86 ± 6% correct discrimination between stress-timed, mora-timed and syllable-timed classes of languages, and 67 ± 8% correct language identification on average for the seven languages with utterances of 21 s. These results are discussed and compared with those obtained with a standard acoustic Gaussian mixture modelling approach (88 ± 5% correct identification for the seven-language identification task). © 2005 Published by Elsevier B.V.
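As a rough illustration of the parameters named in the abstract, the sketch below groups a sequence of already-labelled segments into rhythmic units and computes consonantal duration, vowel duration and cluster complexity for each. The abstract does not spell out the segmentation rule, so the unit definition used here (a run of consonantal segments closed by one vowel) is an assumption, and the names `Segment`, `RhythmicUnit` and `extract_units` are hypothetical. Vowel detection and the Gaussian mixture modelling step are not shown.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed input format: (label, duration in seconds),
# with "C" for a consonantal segment and "V" for a vowel segment.
Segment = Tuple[str, float]


@dataclass
class RhythmicUnit:
    consonant_duration: float  # total duration of the consonantal run
    vowel_duration: float      # duration of the vowel that closes the unit
    complexity: int            # number of consonantal segments in the run


def extract_units(segments: List[Segment]) -> List[RhythmicUnit]:
    """Group segments into units of the assumed form C...CV."""
    units: List[RhythmicUnit] = []
    c_dur, c_count = 0.0, 0
    for label, dur in segments:
        if label == "C":
            # Accumulate the current consonantal run.
            c_dur += dur
            c_count += 1
        elif label == "V":
            # A vowel closes the unit; a vowel with no preceding
            # consonant yields a unit with complexity 0.
            units.append(RhythmicUnit(c_dur, dur, c_count))
            c_dur, c_count = 0.0, 0
        else:
            raise ValueError(f"unknown segment label: {label!r}")
    # Trailing consonants without a closing vowel are dropped in this sketch.
    return units


if __name__ == "__main__":
    demo = [("C", 0.05), ("C", 0.07), ("V", 0.10), ("C", 0.06), ("V", 0.12)]
    for unit in extract_units(demo):
        print(unit)
```

Each unit then gives one three-dimensional feature vector; in a setup like the one the abstract describes, such vectors would be pooled per language to fit one mixture model per language.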


Similar resources

Automatic Modelling of Rhythm and Intonation for Language Identification

This paper deals with an approach to Automatic Language Identification using only prosodic modeling. The traditional approach for language identification focuses mainly on phonotactics because it gives the best results. Recent studies reveal that humans use different levels of perception to identify a language, in particular prosodic cues. Among prosodic features, rhythm is known to carry a sub...


Title: Rhythmic Unit Extraction and Modelling for Automatic Language Identification

Authors: Jean-Luc Rouas, Jérôme Farinas, François Pellegrino, Régine André-Obrecht. (1) Institut de Recherche en Informatique de Toulouse, UMR 5505 CNRS – Institut National Polytechnique de Toulouse – Université Paul Sabatier – Université Toulouse 1, France; (2) Laboratoire Dynamique Du Langage, UMR 5596 CNRS – Université Lumière Lyon 2, France ...


Can Automatically Extracted Rhythmic Units Discriminate among Languages?

This paper deals with rhythmic modeling and its application to language identification. Besides phonetics and phonotactics, rhythm is one of the most promising features to consider for language identification, but significant problems remain unresolved for its modeling. In this paper, an algorithm dedicated to rhythmic segmentation is described. Experiments are performed on read speec...


Kohonen Self Organizing for Automatic Identification of Cartographic Objects

Automatic identification and localization of cartographic objects in aerial and satellite images have gained increasing attention in recent years in digital photogrammetry and remote sensing. Although the automatic extraction of man-made objects is in essence still an unresolved issue, such objects can be extracted from aerial photos and satellite images. Recently, the high-resolution s...


Automatic rhythm modeling for language identification

This paper deals with an approach to Automatic Language Identification based on rhythmic modeling. Besides phonetics and phonotactics, rhythm is one of the most promising features to consider for language identification, but significant problems remain unresolved for its modeling. In this paper, an algorithm of rhythm extraction is described. Experiments are performed on read speech f...



Journal title:
  • Speech Communication

Volume 47

Publication date: 2005